Adding support for safetensors and LoRa. #2448
Conversation
The documentation is not available anymore as the PR was closed or merged.
I don't think the failing test is linked to this PR, is it?
Hi @Narsil! The code looks good to me, and the failing tests have nothing to do with it. For additional context on top of the issue you mentioned: focusing on the first task (converting from other tools to diffusers), I'm not sure how hard the problem is. In addition to changes in key names, some tools seem to do pivotal tuning or textual inversion in addition to the cross-attention layers in our implementation. This PR seems to save the full pipeline instead of attempting to convert the incremental weights.

TL;DR: I think this PR is quite useful and necessary, but I'm not sure if it will help towards the issue you mentioned :) (But I may be wrong, I still have to find the time to test this sort of interoperability.)
Looks good to me!
I think the current approach is reasonable: if the user supplies a custom name on save, they have to provide it on load too, and we'll try safetensors first.
Maybe @sayakpaul or @patil-suraj would like to take a look too.
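Concretely, that round trip would look something like this (a minimal sketch; the keyword names are taken from the PR description and this thread, so double-check them against the merged signatures):

```python
# Assumes `pipe` is a loaded StableDiffusionPipeline, as elsewhere in this thread.
unet = pipe.unet

# Save under a custom name in safetensors format...
unet.save_attn_procs("./lora", weights_name="my_lora.safetensors", safe_serialization=True)

# ...then the same custom name must be supplied again on load;
# the safetensors file is tried first.
unet.load_attn_procs("./lora", weights_name="my_lora.safetensors")
```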
Thanks.
Shall I merge?
Please merge.
I would like to have @patil-suraj review it as well, but he is currently on leave and should be back early next week. If it can wait, I would like to wait until then.
src/diffusers/loaders.py (outdated), from the hunk `@@ -219,7 +250,10 @@ def save_attn_procs(`:

```python
        return

    if save_function is None:
        save_function = torch.save
        if safe_serialization:
            save_function = safetensors.torch.save_file
```
Can we save the format here as well, as this is needed in some loading code? E.g. see the updated code here: `safetensors.torch.save_file(`
Looks good! Can we also save `metadata={"format": "pt"}` here as well?
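Presumably along these lines (a sketch, assuming the `save_function(weights, filename)` calling convention from the diff above):

```python
import safetensors.torch

def save_function(weights, filename):
    # Record the tensor framework in the file's metadata so loading code
    # can check metadata["format"] == "pt".
    return safetensors.torch.save_file(weights, filename, metadata={"format": "pt"})
```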
Done.
Hey all, thanks for all the amazing work here. I need to spend a bit more time on this, but I think this introduces a regression. I have an integration test that calls:

```python
# https://huggingface.co/patrickvonplaten/lora_dreambooth_dog_example/resolve/main/pytorch_lora_weights.bin
pipe.unet.load_attn_procs("pytorch_lora_weights.bin")
```

and it is now failing. I guess that's because it's trying to load a non-safetensors file with the safetensors loader (see `src/diffusers/loaders.py`, lines 153 to 170 at 1f4deb6). There's no exception raised because the file indeed exists; it's just not in safetensors format, so I guess we need a way to handle that. Looking at the surrounding code, the easy way around this seems to be to fall back to `torch.load` when the file isn't a safetensors file.

Looking back at the original PR intro, I guess this is exactly what @Narsil says there: the option to choose the format on load is what's missing.

So should I open a new issue for this? Thanks!
Oops! Thanks for flagging it. I created a fix here: #2551
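For reference, a minimal sketch of the kind of fallback being discussed (this is not the actual #2551 patch): try the safetensors loader first, and fall back to `torch.load` for pickle checkpoints.

```python
import torch
import safetensors.torch

def load_attn_state_dict(path: str) -> dict:
    """Try safetensors first; fall back to torch pickle files such as
    pytorch_lora_weights.bin when the header cannot be parsed."""
    try:
        return safetensors.torch.load_file(path, device="cpu")
    except Exception:
        # safetensors raises e.g. HeaderTooLarge for non-safetensors files
        return torch.load(path, map_location="cpu")
```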
So fast! Thanks, @Narsil! 🙏
@Ir1d can you provide a reproducible workflow (ideally fast to execute)?
@Narsil would something like this work now?
I tried this in my env but it's not working.
Do you have links to the files? Here is a modified version of your script that creates a proper LoRA safetensors file:

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.models.attention_processor import LoRAAttnProcessor

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

model = pipe.unet

lora_attn_procs = {}
for name in model.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else model.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = model.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(model.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = model.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
    )
    lora_attn_procs[name] = lora_attn_procs[name].to(model.device)

    # add 1 to weights to mock trained weights
    with torch.no_grad():
        lora_attn_procs[name].to_q_lora.up.weight += 1
        lora_attn_procs[name].to_k_lora.up.weight += 1
        lora_attn_procs[name].to_v_lora.up.weight += 1
        lora_attn_procs[name].to_out_lora.up.weight += 1

model.set_attn_processor(lora_attn_procs)
model.save_attn_procs("./out", safe_serialization=True)
model.load_attn_procs("./out", use_safetensors=True)
```

This should work? The HeaderTooLarge error seems to indicate that the file you have is corrupted in some way, or isn't a safetensors file to begin with.
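For what it's worth, a quick way to check what kind of file you actually have (a sketch; `xxx.safetensors` stands in for whatever path fails to load): the first 8 bytes of a safetensors file are a little-endian `u64` giving the length of the JSON header, which is exactly where HeaderTooLarge comes from.

```python
import struct

# Hypothetical path: substitute the file that fails to load.
with open("xxx.safetensors", "rb") as f:
    (header_len,) = struct.unpack("<Q", f.read(8))

# For a real safetensors file this is a modest number (a few KB of JSON);
# for a pickle .bin it decodes to a huge/garbage value -> HeaderTooLarge.
print("header length:", header_len)
```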
So our current workflow is: use convert_lora_safetensor_to_diffusers.py to merge a LoRA into its base model, and then, if we want to separate it out again and use it like a native diffusers LoRA, we use this script? @Narsil
Sorry, I'm not familiar with this workflow nor this particular script. Do you have some script + workflow I could use to try and reproduce the faulty file? My guess is that something went wrong during conversion, leading to a bad file, since everything looks relatively straightforward in that script.
Sorry for the misunderstanding. I'm trying to use a LoRA/ckpt from civitai with diffusers, and I wonder what the correct way is. I would like to use it with a LoRA; the suggested way seems to be `load_attn_procs`, but that raised a `KeyError: 'to_k_lora.down.weight'`. https://github.com/huggingface/diffusers/blob/main/scripts/convert_lora_safetensor_to_diffusers.py can merge it with its base model; this works, but gave me a huge model, which cannot easily be used with other base models. I wonder if we need to fuse it with the base model first and save it in the diffusers format using the script above in order to obtain a lightweight LoRA replica (like it originally was).
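For context, what the convert script's "merge" amounts to conceptually is folding each LoRA pair directly into the base weight, which is why the output is a full-size model rather than a lightweight delta (a sketch, not the script itself):

```python
import torch

def merge_lora_pair(base_weight: torch.Tensor,
                    up: torch.Tensor,      # shape (out_features, rank)
                    down: torch.Tensor,    # shape (rank, in_features)
                    alpha: float = 0.75) -> torch.Tensor:
    # base_weight: (out_features, in_features). Once the low-rank product is
    # added in, the LoRA can no longer be detached or reused with a
    # different base model.
    return base_weight + alpha * up @ down
```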
This means the LoRA is still in SD format, and you need to convert it to the diffusers format first.
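A quick way to confirm which format a file is in is to inspect its key names (a sketch; the path is hypothetical and the example keys are typical, not exhaustive):

```python
import safetensors.torch

state = safetensors.torch.load_file("my_lora.safetensors")  # hypothetical path
print(next(iter(state)))
# diffusers format: "down_blocks.0.attentions.0.transformer_blocks.0.attn1.processor.to_k_lora.down.weight"
# SD/kohya format:  "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_k.lora_down.weight"
```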
* Adding support for `safetensors` and LoRa. Adding metadata.
* Fix regression introduced in huggingface#2448. Style.
I want to know how to do it.
Duplicate of #2551 (comment) => let's make this a feature request.
Enabling safetensors support for the LoRA files.

Asked here: huggingface/safetensors#180

Same approach as for the regular model weights: if safetensors is installed and the weights are available in that format, then it is the default.

Adding `safe_serialization` on LoRA creation so users can opt into saving in the safetensors format. What's technically missing is the option to choose the format on load, along with the `weights_name`. I didn't want to add it here for simplicity (since most users should be using the default anyway), but we could add that.
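As a sketch of that default rule (file names are illustrative; `pytorch_lora_weights.bin` appears earlier in this thread, and the `.safetensors` sibling is an assumption):

```python
import os

def pick_weights_file(directory: str) -> str:
    # Prefer safetensors when the library is importable and the file exists;
    # otherwise fall back to the torch pickle weights.
    try:
        import safetensors  # noqa: F401
        has_safetensors = True
    except ImportError:
        has_safetensors = False

    safe_path = os.path.join(directory, "pytorch_lora_weights.safetensors")
    if has_safetensors and os.path.exists(safe_path):
        return safe_path
    return os.path.join(directory, "pytorch_lora_weights.bin")
```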